Semi-Supervised Learning to Identify UMLS Semantic Relations

Authors

  • Yuan Luo
  • Ozlem Uzuner
Abstract

The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configurations. KL divergence consistently performs best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top-level UMLS relations (2-way), and above 50% for clustering the second-level relations (7-way).
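
The abstract names seeded k-means with KL divergence but gives no implementation details, so the following is only a minimal sketch under stated assumptions: feature vectors are smoothed and normalized into probability distributions, centroids are initialized from the mean of each cluster's labeled seeds, and seed labels are used for initialization only. The function names, the smoothing constant, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def normalize_rows(X, eps=1e-9):
    """Smooth and normalize each row into a probability distribution (assumed preprocessing)."""
    X = np.asarray(X, dtype=float) + eps
    return X / X.sum(axis=1, keepdims=True)

def kl_divergence(P, Q):
    """Return an (n, k) matrix whose entry (i, j) is KL(P[i] || Q[j])."""
    return (P[:, None, :] * (np.log(P[:, None, :]) - np.log(Q[None, :, :]))).sum(axis=2)

def seeded_kmeans_kl(X, seed_idx, seed_labels, k, n_iter=50):
    """Seeded k-means sketch: seeds set the initial centroids, then ordinary k-means runs.

    Assumes every cluster 0..k-1 has at least one seed.
    """
    P = normalize_rows(X)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)
    # Initial centroid of cluster c = mean of the seed points labeled c.
    centroids = np.vstack([P[seed_idx[seed_labels == c]].mean(axis=0) for c in range(k)])
    labels = np.full(len(P), -1)
    for _ in range(n_iter):
        # Assign each point to the centroid with the smallest KL(point || centroid).
        new_labels = kl_divergence(P, centroids).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = P[labels == c]
            if len(members):
                # The arithmetic mean minimizes the summed KL(point || centroid).
                centroids[c] = members.mean(axis=0)
    return labels, centroids

# Toy feature counts for six entity pairs; rows 0 and 3 act as labeled seeds.
X = np.array([[9, 1, 0], [8, 2, 0], [7, 3, 0], [0, 1, 9], [0, 2, 8], [1, 2, 7]])
labels, centroids = seeded_kmeans_kl(X, seed_idx=[0, 3], seed_labels=[0, 1], k=2)
print(labels)
```

Swapping `kl_divergence` for a Euclidean or cosine-based distance function would give the other kinds of configuration the abstract alludes to, although the centroid update would then need to match the chosen distance.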

Similar resources

Dynamic categorization of clinical research eligibility criteria by hierarchical clustering

OBJECTIVE To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity. DESIGN The UMLS semantic types and a set of previously developed semantic preference rules were utilized to create an unambiguous semantic feature representation to induce eligibility criteria categories through hie...

Clustering UMLS Semantic Relations Between Medical Concepts

We propose and implement an innovative semi-supervised framework for automatically discovering UMLS semantic relations. Our proposed framework uses semantic, syntactic and orthographic features at both the global level and the local level. We experimented with multiple distance metrics for clustering, including Euclidean distance, spherical k-means distance, and Kullback-Leibler divergence. We show that...

Semi-Supervised Convolution Graph Kernels for Relation Extraction

Extracting semantic relations between entities is an important step towards automatic text understanding. In this paper, we propose a novel Semi-supervised Convolution Graph Kernel (SCGK) method for semantic Relation Extraction (RE) from natural English text. By encoding sentences as dependency graphs of words, SCGK computes kernels (similarities) between sentences using a convolution strategy,...

Automatic Identification of Treatment Relations For Medical Ontology Learning: An Exploratory Study

This study is part of a project to develop an automatic method to build ontologies, especially in a medical domain, from a document collection. An earlier study had investigated an approach to inferring semantic relations between medical concepts using the UMLS (Unified Medical Language System) semantic net. The study found that semantic relations between concepts could be inferred 68% of the t...

Volume: 2014

Publication date: 2014